align with grpc base/balancer to trigger reconnect in Idle state (when we move to gRPC 1.41+) #335
Conversation
Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.11.0 to 1.11.1.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.11.0...v1.11.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Some questions below. I knew this stuff pretty well at one point, but it's mostly left my head and I need to be hand-held through it a bit more. Thanks! And sorry for the delay
sc.Connect()
}
d.mu.Unlock()
return
Why is it important to return here, given that we don't return with the Connect() call below?
As far as I can tell, the purpose is to not call UpdateState below. Is that true, and why would it be a problem to do so?
(I'm trying to figure out if this code block could be eliminated, and instead have the check below be the only place where Connect() is called.)
Also, the core problem (as I understand it) is that Idle sub-connections won't actually re-connect unless sc.Connect() is explicitly called, right?
Was this a change in gRPC behavior? I'm confused how this isn't a problem we've experienced ourselves. For that matter, help me understand: how does a previously-active SubConn come to be Idle again (my very old recollection is that they bounce between failure and Connecting)?
In any case, since the check below will cause Connect to be called, is there a particular reason that we don't want to update connState[sc] if it's state.Idle now?
Put another way, would the root issue be resolved with just the addition below of:
if state.ConnectivityState == connectivity.Idle {
	sc.Connect()
}
Almost certainly I'm missing something 🤷.
No problem. I am going to take another look at the changes that occurred in gRPC over time, as it has been updated in our master a couple of times, to see if it plays any part. Agreed, it is strange that this has not popped up in your deployment.
A lot of my investigation was following the evolution (via blame) of https://github.com/grpc/grpc-go/blob/master/balancer/base/balancer.go#L180, as we appeared to line up pretty closely with the overall pattern of UpdateSubConnState(), but they have been tweaking it over time.
I will get back with hopefully good answers to your questions and your simplification may be valid.
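For reference, the Idle handling in grpc-go's base balancer that the dispatcher change mirrors looks roughly like the sketch below. This is a simplified paraphrase, not the upstream code verbatim, and the exact logic varies across releases; the stand-in type `baseLikeBalancer` exists only to make the snippet self-contained.

```go
package sketch

import (
	"google.golang.org/grpc/balancer"
	"google.golang.org/grpc/connectivity"
)

// baseLikeBalancer is a pared-down stand-in used only to show the shape of
// the Idle handling in grpc-go's balancer/base; it is not the upstream code.
type baseLikeBalancer struct {
	cc       balancer.ClientConn
	scStates map[balancer.SubConn]connectivity.State
}

func (b *baseLikeBalancer) UpdateSubConnState(sc balancer.SubConn, state balancer.SubConnState) {
	s := state.ConnectivityState
	oldS, ok := b.scStates[sc]
	if !ok {
		// Unknown (or already removed) SubConn: ignore the update.
		return
	}
	if oldS == connectivity.TransientFailure &&
		(s == connectivity.Connecting || s == connectivity.Idle) {
		// Early return: kick an Idle SubConn back into connecting without
		// regenerating the picker or calling cc.UpdateState.
		if s == connectivity.Idle {
			sc.Connect()
		}
		return
	}
	b.scStates[sc] = s
	switch s {
	case connectivity.Idle:
		// An Idle SubConn will not re-dial on its own; Connect() is required.
		sc.Connect()
	case connectivity.Shutdown:
		delete(b.scStates, sc)
	}
	// ...the upstream balancer then regenerates its picker and pushes the
	// aggregated state via b.cc.UpdateState(...).
}
```

The parts relevant to this PR are the `case connectivity.Idle: sc.Connect()` arm and the early return in the top check, which is the return behavior discussed above.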
Still looking, but I think this gRPC PR grpc/grpc-go@03268c8#diff-f9772420643575b997f617c6d4c1934aaa26f057042c7fa71a521ef5bb2af253 is the one where gRPC introduced the transition to the Idle state and adjusted their own example load balancer to work with it.
In their example balancer/base/balancer.go they return in the top check for the Idle state and bypass the call to UpdateState(), so I followed that. I can dig deeper to see why that may or may not make a difference.
Looking deeper, I think the change above may be in a later version of gRPC than the gazette repo is using (gazette's go.mod shows v1.40.0). Our application mono-repo uses gRPC in a lot of services and our go.mod is at v1.52.0. That may explain why you do not see this issue and we do (in our consumers). Let me confirm the exact version in which the commit above shows up.
It seems like the gRPC change above was for the 1.41.0 release, which the gazette repo does not yet use. They spell out the behavior changes here: https://github.com/grpc/grpc-go/releases/tag/v1.41.0. Even though the commit I link to is not explicitly called out, it follows the balancer #4613 PR they mention.
So my PR may be premature for wider release.
Will talk to Michael when he comes back next week to see if we should close this PR to master.
@jgraettinger I guess these (or similar) changes will be required when gRPC is upgraded to 1.41+. For now I changed the title to reflect this. I will leave this open for now, but will close it if you think that is best.
…om/prometheus/client_golang-1.11.1' into broker-restart-reconnect
Just an FYI: only the dispatcher.go changes relate to this PR. We somewhat polluted this PR by merging open PR #331 (go.mod, go.sum) into our fork (as we required it in our product as well). It should have been done on a new branch in our fork.
Thanks, sorry for the long delay here. We've got a further PR that builds upon this PR to update gRPC to 1.59, which we've been testing. We're planning to land both this PR and that one on Monday.
Thanks for the heads up.
When a k8s rolling restart occurs (e.g. `kubectl rollout restart deployment gazette`) we see the brokers' IP addresses change, and the consumers need to be restarted to re-establish their GRPC connections. Without a restart of the consumers, their default service SubConns wait forever in the connectivity Idle state on their open Read GRPCs.

This change adds extra Idle state handling to the dispatcher's `UpdateSubConnState` method to align with the GRPC base balancer's implementation (https://github.com/grpc/grpc-go/blob/master/balancer/base/balancer.go#L180), which tries to connect when a SubConn is in the Idle state.

The end result of this change is that connectivity is re-established from the consumers to the brokers (after the brokers have restarted), but these new connections often go via the default service SubConn and so do not honor the intra-zone routing preferences (as the `Pick` method is called without a Route), as outlined below. The changes do not appear to affect the existing setup of SubConns and improve robustness under restarts (though it is not perfect from a zone perspective).
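For readers following along, a minimal sketch of the shape of this dispatcher change is below. The type and field names (`dispatcherSketch`, `connState`, `cc`) approximate the snippets discussed in the review and are assumptions; this is not the exact diff, and the real dispatcher tracks considerably more state (routes, marks, the default service SubConn).

```go
package sketch

import (
	"sync"

	"google.golang.org/grpc/balancer"
	"google.golang.org/grpc/connectivity"
)

// dispatcherSketch is a simplified stand-in for the dispatcher; the real one
// also tracks routes, marks, and the default service SubConn.
type dispatcherSketch struct {
	cc        balancer.ClientConn
	mu        sync.Mutex
	connState map[balancer.SubConn]connectivity.State
}

func (d *dispatcherSketch) UpdateSubConnState(sc balancer.SubConn, state balancer.SubConnState) {
	d.mu.Lock()
	if state.ConnectivityState == connectivity.Idle {
		// Idle SubConns never re-dial on their own. Without this explicit
		// Connect(), open Read RPCs sit Idle forever after a broker
		// rolling restart.
		sc.Connect()
		d.mu.Unlock()
		// Return without updating connState or the picker, following the
		// early-return pattern in grpc-go's base balancer.
		return
	}
	d.connState[sc] = state.ConnectivityState
	d.mu.Unlock()
	// ...the existing dispatcher logic then prunes Shutdown SubConns and
	// pushes a refreshed picker via d.cc.UpdateState(...).
}
```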
Prior to the change:

After initial system bringup or a restart of the consumers, the GRPC connections follow the expected intra-zone routing (in the case below the consumer is in zone `us-central1-b`):

Perform a `kubectl rollout restart deployment gazette` and the open Read GRPCs remain in the Idle state. Sometimes a few (~25% in my case) of the SubConns do successfully navigate the transition. Those that do were found to have received a `NOT_JOURNAL_BROKER` response in their `broker/client/reader.go` `Read` method, which made them pass a full Route to the `Pick` routine (not shown below, as in this case all Read GRPCs went Idle). This non-empty Route is often composed of both new and old broker information, similar to #215, but potentially that was a temporary situation at the moment the `Pick` was triggered.

After the change:
(Ignore the extra DJD traces that dump the dispatcher structures.)
Again, at initial system bringup or a restart of the consumers, the GRPC connections follow the expected intra-zone routing (in the case below the consumer is in zone `us-central1-c`):

Perform a `kubectl rollout restart deployment gazette` and the open Read GRPCs reattach to the brokers.

A few of them that went through the `NOT_JOURNAL_BROKER` handling were given a non-empty route and reattach directly to a broker pod. Note that in the case below the first Pick route entry (gazette-756b95d7d4-mq8kf) is actually in the process of being replaced, but is not chosen due to its zone. I think a GRPC can cross zones if the new broker for that zone is not present in the list at the time of the `Pick` call.

Most of the open Read GRPCs actually have the `Pick` method called with no route and connect via the default service SubConn (10.44.x.x address), which does not trigger any intra-zone rule handling.
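To illustrate that last point, here is a hypothetical zone-aware picker built on grpc-go's `balancer.Picker` interface. The names `zonePicker`, `routeZones`, `byZone`, and `defaultSC` are illustrative stand-ins, not gazette's actual dispatcher API; the sketch only shows why a Pick with no attached route has nothing to filter by zone and falls back to the default service SubConn.

```go
package sketch

import (
	"context"

	"google.golang.org/grpc/balancer"
)

// zonePicker is a hypothetical picker, not gazette's real one.
type zonePicker struct {
	localZone string
	defaultSC balancer.SubConn            // default service SubConn (the 10.44.x.x address above)
	byZone    map[string]balancer.SubConn // one SubConn per known broker zone, simplified
}

// routeZones is a placeholder for however the client attaches a Route to the
// RPC context; gazette's real mechanism differs.
func routeZones(ctx context.Context) ([]string, bool) { return nil, false }

func (p *zonePicker) Pick(info balancer.PickInfo) (balancer.PickResult, error) {
	zones, ok := routeZones(info.Ctx)
	if !ok || len(zones) == 0 {
		// No route was attached: intra-zone preferences cannot apply, so the
		// pick goes to the default service SubConn.
		return balancer.PickResult{SubConn: p.defaultSC}, nil
	}
	// A route is present: prefer a member in the local zone.
	for _, z := range zones {
		if z == p.localZone {
			if sc, found := p.byZone[z]; found {
				return balancer.PickResult{SubConn: sc}, nil
			}
		}
	}
	// No same-zone member in the route (e.g. its replacement is not listed
	// yet): cross zones rather than fail the pick.
	if sc, found := p.byZone[zones[0]]; found {
		return balancer.PickResult{SubConn: sc}, nil
	}
	return balancer.PickResult{SubConn: p.defaultSC}, nil
}
```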